Running a Test & Viewing the Report

In this section, you will learn how to test an AI model and validate it against your dataset version.

Testing the AI Model

To run a test and view the report, do the following:

  • Click the Run Test button to start testing and validating the AI model.

The AI model is successfully submitted for testing.

After a few minutes, the test starts running on the AI model.

After the test completes successfully, you can view its F1 Score, Precision, and Recall.

For a detailed report, click the View Report button.

Viewing the Report

In this section, you will learn how to view the AI model performance report.

To view the report, do the following:

  • Click the View Report button.

    The Test Results dialog box is displayed.

Understanding the Test Report

In this section, you will learn how to read and interpret the AI model performance report.

Summary of Key Metrics

When you run a model test, the platform calculates the following core AI evaluation metrics. These metrics indicate how well the AI model detects your target objects (s300 in this case) compared with the ground-truth dataset.

| Metric | Value | Definition | Why It Matters |
| --- | --- | --- | --- |
| F1 Score | 95.0% | Harmonic mean of Precision and Recall: 2 × (Precision × Recall) / (Precision + Recall) | Since Precision = Recall, F1 is equal to both, indicating balanced performance with no trade-off. |
| Precision | 95.0% | Correctly predicted positive cases out of all predicted positives: TP / (TP + FP) | 95% of predictions labeled as positive (s300) were actually correct → very few false alarms. |
| Recall | 95.0% | Correctly predicted positive cases out of all actual positives: TP / (TP + FN) | 95% of actual s300 instances were detected → only one or two were missed. |
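
The formulas in the table translate directly into code. The following is a minimal sketch, not the platform's implementation, of how Precision, Recall, and F1 Score are derived from true-positive (TP), false-positive (FP), and false-negative (FN) counts; the example counts are hypothetical.

```python
# Minimal sketch (not the platform's implementation): derive Precision,
# Recall, and F1 Score from TP, FP, and FN counts for one category.

def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts chosen so Precision = Recall = 95%; F1 then equals both.
print(detection_metrics(tp=19, fp=1, fn=1))
```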

Confusion Matrix – Plot

Summarizes classification performance using a confusion matrix across multiple categories.

|  | Predicted: s300 | Predicted: background | Predicted: *background |
| --- | --- | --- | --- |
| Actual: s300 | 17 (True Positive - TP) | 0 (False Negative - FN) | 1 (FN / misclassification) |
| Actual: background | 0 | 0 | 0 |
| Actual: *background | 1 (False Positive - FP) | 0 | 0 |

Interpretation

  • 17 TP: Model correctly identified 17 instances of s300.
  • 1 FN: One s300 was missed and classified as *background.
  • 1 FP: A *background instance was incorrectly labeled as s300.
  • Balanced performance: Precision, Recall, and F1 Score are all 95%, showing that the model performs consistently across detection and error handling.

💡 Use Case: Identify both misclassifications and missed detections while confirming balanced model performance.
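
A matrix like the one above can also be reproduced outside the platform by comparing exported ground-truth and predicted category labels. Below is a minimal sketch using scikit-learn (an assumption, not the platform's API), with hypothetical per-instance labels constructed to match the counts shown above.

```python
# Minimal sketch, not the platform's API: build a multi-category
# confusion matrix from per-instance ground-truth and predicted labels.
from sklearn.metrics import confusion_matrix

categories = ["s300", "background", "*background"]

# Hypothetical labels constructed to reproduce the counts above:
# 17 correct s300 detections, 1 s300 missed as *background,
# and 1 *background instance predicted as s300.
y_true = ["s300"] * 18 + ["*background"]
y_pred = ["s300"] * 17 + ["*background", "s300"]

matrix = confusion_matrix(y_true, y_pred, labels=categories)
print(matrix)  # rows = actual category, columns = predicted category
```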

Confusion Matrix – Visualize

Provides a geospatial visualization of actual vs predicted detections on imagery.

| Element | Description |
| --- | --- |
| Dataset List | Choose files from the dataset (e.g., ZIP archives) to view in the map. |
| Satellite Map | Overlays actual annotations (red) and model predictions (blue). |
| Category Toggle | Enable or disable categories such as s300 and *background. |
| Confidence Slider | Adjust threshold to filter predictions based on model confidence scores. |

💡 Use Case: Helps users visually inspect and confirm whether detections align correctly with ground truth objects on the imagery.
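
Conceptually, the Confidence Slider is a simple filter over the model's confidence scores: predictions below the threshold are hidden from the map. The sketch below illustrates the idea; the prediction records and field names are hypothetical, not a platform export format.

```python
# Minimal sketch of what the Confidence Slider does conceptually:
# keep only predictions whose confidence meets the chosen threshold.

predictions = [
    {"category": "s300", "confidence": 0.91},
    {"category": "s300", "confidence": 0.62},
    {"category": "*background", "confidence": 0.40},
]

def filter_by_confidence(preds, threshold):
    """Return only the predictions at or above the confidence threshold."""
    return [p for p in preds if p["confidence"] >= threshold]

print(filter_by_confidence(predictions, threshold=0.75))
# Only the 0.91-confidence s300 prediction remains.
```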

Dataset & Category Score

Displays evaluation metrics per dataset and per category, enabling fine-grained analysis.

Score Table

| Dataset | F1 Score | Precision | Recall |
| --- | --- | --- | --- |
| Version 1 | 95.0% | 95.0% | 95.0% |
| └── s300 | 95.0% | 95.0% | 95.0% |
| └── background | 95.0% | 95.0% | 95.0% |
| └── *background | 95.0% | 95.0% | 95.0% |

Interpretation

  • All three metrics (F1, Precision, Recall) are identical (95%), indicating balanced model behavior.
  • The performance is consistent across classes, suggesting the model handles positives (s300) and negatives (background classes) without bias.

💡 Use Case: Use this tab to confirm balanced accuracy across multiple categories, ensuring the model is equally reliable in detecting and rejecting classes.
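
If you export per-instance ground-truth and predicted labels, per-category scores like those in the table can also be computed offline. Below is a minimal sketch using scikit-learn (an assumption, not the platform's API), with hypothetical labels.

```python
# Minimal sketch, not the platform's API: per-category Precision, Recall,
# and F1 from exported ground-truth and predicted labels.
from sklearn.metrics import precision_recall_fscore_support

categories = ["s300", "background", "*background"]

# Hypothetical per-instance labels.
y_true = ["s300"] * 18 + ["*background"]
y_pred = ["s300"] * 17 + ["*background", "s300"]

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=categories, zero_division=0
)
for cat, p, r, f in zip(categories, precision, recall, f1):
    print(f"{cat}: precision={p:.1%}  recall={r:.1%}  f1={f:.1%}")
```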